Supertagging: An Approach To Almost Parsing
Authors
Abstract
In this paper, we have proposed novel methods for robust parsing that integrate the flexibility of linguistically motivated lexical descriptions with the robustness of statistical techniques. Our thesis is that the computation of linguistic structure can be localized if lexical items are associated with rich descriptions (supertags) that impose complex constraints in a local context. The supertags are designed such that only those elements on which the lexical item imposes constraints appear within a given supertag. Further, each lexical item is associated with as many supertags as the number of different syntactic contexts in which it can appear. This makes the number of different descriptions for each lexical item much larger than when the descriptions are less complex, thus increasing the local ambiguity for a parser. This local ambiguity can, however, be resolved by using statistical distributions of supertag co-occurrences collected from a corpus of parses. We have explored these ideas in the context of the Lexicalized Tree-Adjoining Grammar (LTAG) framework. The supertags in LTAG combine both phrase-structure information and dependency information in a single representation. Supertag disambiguation results in a representation that is effectively a parse (an almost parse), and the parser need 'only' combine the individual supertags. This method of parsing can also be used to parse sentence fragments, such as those in spoken utterances, where the disambiguated supertag sequence may not combine into a single structure.
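The sketch below illustrates the co-occurrence-based disambiguation idea with a toy lexicon, a bigram model, and Viterbi search. The supertag names, the counts, and the bigram simplification are illustrative assumptions, not the LTAG grammar or the n-gram model actually used in the paper.

```python
import math

# Each word is ambiguous among several supertags (names of elementary trees).
LEXICON = {
    "the":   ["Det"],
    "price": ["NP_head", "N_modifier"],
    "fell":  ["V_intrans", "V_particle"],
}

# Illustrative bigram counts of adjacent supertags, standing in for the
# co-occurrence statistics collected from a corpus of parses.
BIGRAM = {
    ("<s>", "Det"): 90,
    ("Det", "NP_head"): 80, ("Det", "N_modifier"): 20,
    ("NP_head", "V_intrans"): 60, ("NP_head", "V_particle"): 10,
    ("N_modifier", "V_intrans"): 5, ("N_modifier", "V_particle"): 5,
}

def bigram_logprob(prev, cur, alpha=1.0, vocab=10):
    """Add-one smoothed log P(cur | prev)."""
    num = BIGRAM.get((prev, cur), 0) + alpha
    den = sum(c for (p, _), c in BIGRAM.items() if p == prev) + alpha * vocab
    return math.log(num / den)

def disambiguate(words):
    """Viterbi search for the most probable supertag sequence."""
    prev_col = {"<s>": (0.0, [])}      # supertag -> (logprob, best path so far)
    for word in words:
        col = {}
        for tag in LEXICON[word]:
            # Best way to extend any previous supertag with this one.
            score, path = max(
                ((s + bigram_logprob(p, tag), pth + [tag])
                 for p, (s, pth) in prev_col.items()),
                key=lambda x: x[0])
            col[tag] = (score, path)
        prev_col = col
    return max(prev_col.values(), key=lambda x: x[0])[1]

print(disambiguate(["the", "price", "fell"]))  # -> ['Det', 'NP_head', 'V_intrans']
```

The output assigns one supertag per word; combining those supertags into a single derivation is the residual "almost parsing" step the abstract describes.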
Similar resources
TOWARDS EFFICIENT STATISTICAL PARSING USING LEXICALIZED GRAMMATICAL INFORMATION
For a long time, the goal of wide-coverage natural language parsing had remained elusive. Much progress has been made recently, however, with the development of lexicalized statistical models of natural language parsing. Although lexicalized tree-adjoining grammar (TAG) is a lexicalized grammatical formalism whose development predates these recent advances, its application in lexicalized statis...
A Dynamic Window Neural Network for CCG Supertagging
Combinatory Categorial Grammar (CCG) supertagging is the task of assigning lexical categories to each word in a sentence. Almost all previous methods use fixed context window sizes to encode input tokens. However, it is obvious that different tags usually rely on different context window sizes. This motivates us to build a supertagger with a dynamic window approach, which can be treated as an attentio...
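As a rough illustration of the dynamic-window idea, the sketch below pools a fixed candidate neighbourhood with attention weights, so the effective context size varies per token. The dimensions, the bilinear scoring function, and the random embeddings are assumptions made for illustration, not the architecture of the cited paper.

```python
import numpy as np

rng = np.random.default_rng(0)
sent_len, dim, max_window = 7, 16, 3
embeddings = rng.normal(size=(sent_len, dim))   # stand-in word embeddings
W = rng.normal(size=(dim, dim)) / np.sqrt(dim)  # bilinear attention weights

def dynamic_context(i):
    """Attention-weighted sum of the neighbours of position i."""
    lo, hi = max(0, i - max_window), min(sent_len, i + max_window + 1)
    neighbours = embeddings[lo:hi]              # candidate context tokens
    scores = neighbours @ W @ embeddings[i]     # relevance of each neighbour
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                    # softmax over the window
    return weights @ neighbours                 # context vector fed to the tagger

print(dynamic_context(3).shape)  # (16,) -- one adaptive context vector per token
```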
Stacking or Supertagging for Dependency Parsing - What's the Difference?
Supertagging was recently proposed to provide syntactic features for statistical dependency parsing, in contrast to its traditional use as a disambiguation step. We conduct a broad range of controlled experiments to compare this specific application of supertagging with another method for providing syntactic features, namely stacking. We find that in this context supertagging is a form of stacking...
Supertagging: Introduction, learning, and application
Supertagging is an approach originally developed by Bangalore and Joshi (1999) to improve parsing efficiency. In the beginning, the scholars used small training datasets and somewhat naïve smoothing techniques to learn the probability distributions of supertags. Since its inception, the applicability of supertags has been explored for the TAG (tree-adjoining grammar) formalism as well as othe...
HPSG Supertagging: A Sequence Labeling View
Supertagging is a widely used speed-up technique for deep parsing. Beyond parsing, supertagging has also been exploited in other NLP tasks to make use of the rich syntactic information carried by supertags. However, the performance of the supertagger is still a bottleneck for such applications. In this paper, we investigated the relationship between supertagging and parsing, not just to...
CCG Supertagging with a Recurrent Neural Network
Recent work on supertagging using a feedforward neural network achieved significant improvements for CCG supertagging and parsing (Lewis and Steedman, 2014). However, their architecture is limited to considering local contexts and does not naturally model sequences of arbitrary length. In this paper, we show how directly capturing sequence information using a recurrent neural network leads to f...
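A minimal sketch of a recurrent supertagger in this spirit appears below: a bidirectional LSTM reads the whole sentence and a per-token output layer predicts one supertag per word. The framework (PyTorch), vocabulary size, tagset size, and dimensions are illustrative assumptions, not the cited paper's exact model.

```python
import torch
import torch.nn as nn

class RNNSupertagger(nn.Module):
    def __init__(self, vocab_size=10000, n_tags=425, emb_dim=100, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Bidirectional: each position sees the full left and right context,
        # rather than a fixed local window.
        self.rnn = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_tags)

    def forward(self, word_ids):                   # (batch, seq_len)
        states, _ = self.rnn(self.embed(word_ids))
        return self.out(states)                    # (batch, seq_len, n_tags) logits

tagger = RNNSupertagger()
logits = tagger(torch.randint(0, 10000, (1, 6)))   # one 6-word sentence
print(logits.argmax(-1))                            # predicted supertag indices
```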